1. Loading dataset
There are 216006 rows and 6 columns
   component  action  target  contextid  contextlevel  contextinstanceid
0          5      10       5     105728            50               1089
1         17      10       6     105844            70              67635
2          5      10       5     105728            50               1089
3          9      10      12     105734            70              67525
4          5      10       5     106176            50               1094
5          5      10       5     106176            50               1094
6          5      10       5     106176            50               1094
7         18      10       6     110766            70              70458
8          5      10       5     105728            50               1089
9         17      10       6     105844            70              67635
           component         action         target      contextid   contextlevel  contextinstanceid
count   2.160060e+05   2.160060e+05   2.160060e+05   2.160060e+05   2.160060e+05       2.160060e+05
mean    4.825240e-13  -8.526593e-14  -4.700615e-14  -2.279710e-13   4.432454e-13      -2.905845e-13
std     1.000002e+00   1.000002e+00   1.000002e+00   1.000002e+00   1.000002e+00       1.000002e+00
min    -1.634216e+00  -8.719580e+00  -1.488072e+00  -1.180162e+00  -1.083591e+00      -9.725989e-01
25%    -8.036008e-01   1.932715e-01  -3.624239e-01  -1.180162e+00  -1.083591e+00      -9.725989e-01
50%     2.701443e-02   1.932715e-01  -3.624239e-01   2.389723e-01   9.228571e-01      -1.993837e-01
75%     2.701443e-02   1.932715e-01   2.004002e-01   1.235776e+00   9.228571e-01       7.680869e-01
max     2.103553e+00   1.932715e-01   2.733109e+00   1.442257e+00   9.228571e-01       1.670854e+00
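
The summary statistics above show means near 0 and standard deviations of 1.000002, which is what z-score standardization would produce. The script that generated this log is not shown, so the following is only a minimal sketch of that step using NumPy. The `standardize` helper and the toy `rows` array (a few values copied from the preview) are illustrative assumptions, not the original code. The sketch also shows one plausible reason the reported std is 1.000002 rather than exactly 1: scaling with the population standard deviation (ddof=0) and then reporting the sample standard deviation (ddof=1) leaves a factor of sqrt(n / (n - 1)).

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit population variance (ddof=0)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Toy input: component, target and contextid from rows 0, 1, 3, 4 and 7 of the
# preview above ('action' is skipped because it is constant in the preview).
rows = np.array([
    [5, 5, 105728],
    [17, 6, 105844],
    [9, 12, 105734],
    [5, 5, 106176],
    [18, 6, 110766],
])

Z = standardize(rows)
n = Z.shape[0]

print(Z.mean(axis=0))         # numerically close to 0
print(Z.std(axis=0, ddof=1))  # sqrt(n / (n - 1)), slightly above 1

# The same effect at the row count reported in the log.
n_log = 216006
std_log = np.sqrt(n_log / (n_log - 1))
print(f"{std_log:e}")         # 1.000002e+00
```

A column that is constant over the whole dataset would have zero standard deviation and would need to be dropped or handled separately before this step.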
2. Reducing via PCA
Explained variance ratio per principal component: [0.54952458 0.20794695]
Cumulative variance explained by 2 principal components: 75.75%
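
For reference, the per-component and cumulative figures above can be reproduced in principle with a PCA computed from the singular value decomposition of the centered data. This is a hedged sketch: the `pca` helper and the random `toy` data are assumptions made for illustration, and the original script may well have used a library implementation instead. The last lines only re-add the two ratios printed in the log to confirm the 75.75% total.

```python
import numpy as np

def pca(Z, n_components):
    """Return (components, explained_variance_ratio) for the top components."""
    Z = np.asarray(Z, dtype=float)
    Z = Z - Z.mean(axis=0)                  # PCA operates on centered data
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = S ** 2 / (Z.shape[0] - 1)   # variance along each component
    ratios = variances / variances.sum()
    return Vt[:n_components], ratios[:n_components]

# Hypothetical stand-in data: 200 samples of 6 correlated features.
rng = np.random.default_rng(0)
toy = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
toy = (toy - toy.mean(axis=0)) / toy.std(axis=0)

components, ratios = pca(toy, n_components=2)
print("Explained variance ratio:", ratios)
print(f"Cumulative: {ratios.sum():.2%}")

# Checking the cumulative figure reported in the log.
log_ratios = np.array([0.54952458, 0.20794695])
print(f"{log_ratios.sum():.2%}")  # 75.75%
```

Each row of `components` is a unit vector over the six features. Its entries are the loadings for that principal component.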
      component    action   target  contextid  contextlevel  contextinstanceid
PC_1   0.472440  0.093180  0.04890   0.470343      0.509519           0.533756
PC_2   0.317474  0.535157  0.75652   0.056370      0.192524           0.015838
*************** Most important features *************************
As per PC 1:
component            0.472440
contextid            0.470343
contextlevel         0.509519
contextinstanceid    0.533756
Name: PC_1, dtype: float64
As per PC 2:
component    0.317474
action       0.535157
target       0.756520
Name: PC_2, dtype: float64
******************************************************************
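
The feature lists above look like the result of keeping, for each component, the loadings whose magnitude exceeds some cutoff. The log does not state the cutoff, so the value 0.3 below is an assumption that happens to reproduce both lists (any value between about 0.19 and 0.32 would do the same). The `important_features` helper is likewise hypothetical. The loadings themselves are copied from the table in the log.

```python
import numpy as np

features = ["component", "action", "target",
            "contextid", "contextlevel", "contextinstanceid"]

# Loadings as printed in the log.
loadings = np.array([
    [0.472440, 0.093180, 0.04890, 0.470343, 0.509519, 0.533756],  # PC_1
    [0.317474, 0.535157, 0.75652, 0.056370, 0.192524, 0.015838],  # PC_2
])

THRESHOLD = 0.3  # assumed cutoff, not stated in the log

def important_features(row, names, threshold):
    """Keep the features whose absolute loading exceeds the threshold."""
    return {name: float(v) for name, v in zip(names, row) if abs(v) > threshold}

important = {
    f"PC_{i + 1}": important_features(row, features, THRESHOLD)
    for i, row in enumerate(loadings)
}
for pc, selected in important.items():
    print(f"As per {pc}: {selected}")

# Sanity checks on the printed loadings.
norms = np.linalg.norm(loadings, axis=1)     # each row is close to unit length
overlap = float(loadings[0] @ loadings[1])   # clearly non-zero
print(norms, overlap)
```

Both rows have unit length, yet their dot product is far from zero, whereas true principal axes are orthogonal. That suggests the printed loadings are absolute values, so the signs (and therefore the direction in which each feature moves a component) cannot be read off this output.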